The Emergence of Canalization and Evolvability in an Open-Ended, Interactive Evolutionary System
Natural evolution has produced a tremendous diversity of functional
organisms. Many believe an essential component of this process was the
evolution of evolvability, whereby evolution speeds up its ability to innovate
by generating a more adaptive pool of offspring. One hypothesized mechanism for
evolvability is developmental canalization, wherein certain dimensions of
variation become more likely to be traversed and others are prevented from
being explored (e.g. offspring tend to have similarly sized legs, and mutations
affect the length of both legs, not each leg individually). While ubiquitous in
nature, canalization almost never evolves in computational simulations of
evolution. Not only does that deprive us of in silico models in which to study
the evolution of evolvability, but it also raises the question of which
conditions give rise to this form of evolvability. Answering this question
would shed light on why such evolvability emerged naturally and could
accelerate engineering efforts to harness evolution to solve important
engineering challenges. In this paper we reveal a unique system in which
canalization did emerge in computational evolution. We document that genomes
entrench certain dimensions of variation that were frequently explored during
their evolutionary history. The genetic representation of these organisms also
evolved to be highly modular and hierarchical, and we show that these
organizational properties correlate with increased fitness. Interestingly, the
type of computational evolutionary experiment that produced this evolvability
was very different from traditional digital evolution in that there was no
objective, suggesting that open-ended, divergent evolutionary processes may be
necessary for the evolution of evolvability. Comment: SI can be found at: http://www.evolvingai.org/files/SI_0.zi
Safe Mutations for Deep and Recurrent Neural Networks through Output Gradients
While neuroevolution (evolving neural networks) has a successful track record
across a variety of domains from reinforcement learning to artificial life, it
is rarely applied to large, deep neural networks. A central reason is that
while random mutation generally works in low dimensions, a random perturbation
of thousands or millions of weights is likely to break existing functionality,
providing no learning signal even if some individual weight changes were
beneficial. This paper proposes a solution by introducing a family of safe
mutation (SM) operators that aim within the mutation operator itself to find a
degree of change that does not alter network behavior too much, but still
facilitates exploration. Importantly, these SM operators do not require any
additional interactions with the environment. The most effective SM variant
capitalizes on the intriguing opportunity to scale the degree of mutation of
each individual weight according to the sensitivity of the network's outputs to
that weight, which requires computing the gradient of outputs with respect to
the weights (instead of the gradient of error, as in conventional deep
learning). This safe mutation through gradients (SM-G) operator dramatically
increases the ability of a simple genetic algorithm-based neuroevolution method
to find solutions in high-dimensional domains that require deep and/or
recurrent neural networks (which tend to be particularly brittle to mutation),
including domains that require processing raw pixels. By improving our ability
to evolve deep neural networks, this new safer approach to mutation expands the
scope of domains amenable to neuroevolution.
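
The idea of scaling each weight's mutation by the sensitivity of the network's outputs to that weight can be illustrated with a minimal sketch. This is not the authors' implementation: it estimates sensitivities with central finite differences rather than backpropagated output gradients, and the names `output_sensitivity`, `safe_mutation`, `scale`, and `min_sens` are hypothetical. The network is any function `net(weights, x)` returning a list of outputs, and `inputs` is assumed to be a stored batch of previously seen inputs, so no new environment interaction is needed.

```python
import math
import random


def output_sensitivity(net, weights, inputs, h=1e-4):
    """Estimate, per weight, how strongly the outputs react to that weight.

    Uses central finite differences (an assumption made for this sketch;
    the abstract describes computing the gradient of outputs directly).
    """
    sens = []
    for i in range(len(weights)):
        up = list(weights)
        down = list(weights)
        up[i] += h
        down[i] -= h
        total = 0.0
        for x in inputs:
            for a, b in zip(net(up, x), net(down, x)):
                total += ((a - b) / (2.0 * h)) ** 2
        sens.append(math.sqrt(total))
    return sens


def safe_mutation(net, weights, inputs, scale=0.1, min_sens=1e-2, rng=None):
    """Gaussian mutation where sensitive weights receive smaller changes."""
    rng = rng if rng is not None else random.Random()
    sens = output_sensitivity(net, weights, inputs)
    # Floor the sensitivity (hypothetical threshold) so that weights with
    # near-zero sensitivity do not receive arbitrarily large perturbations.
    return [w + scale * rng.gauss(0.0, 1.0) / max(s, min_sens)
            for w, s in zip(weights, sens)]
```

In this sketch a weight whose change strongly moves the outputs is perturbed less than one the outputs barely depend on, which is the intuition behind limiting behavioral change per mutation.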
ES Is More Than Just a Traditional Finite-Difference Approximator
An evolution strategy (ES) variant based on a simplification of a natural
evolution strategy recently attracted attention because it performs
surprisingly well in challenging deep reinforcement learning domains. It
searches for neural network parameters by generating perturbations to the
current set of parameters, checking their performance, and moving in the
aggregate direction of higher reward. Because it resembles a traditional
finite-difference approximation of the reward gradient, it can naturally be
confused with one. However, this ES optimizes for a different gradient than
just reward: It optimizes for the average reward of the entire population,
thereby seeking parameters that are robust to perturbation. This difference can
channel ES into distinct areas of the search space relative to gradient
descent, and also consequently to networks with distinct properties. This
unique robustness-seeking property, and its consequences for optimization, are
demonstrated in several domains. They include humanoid locomotion, where
networks from policy gradient-based reinforcement learning are significantly
less robust to parameter perturbation than ES-based policies solving the same
task. While the implications of such robustness and robustness-seeking remain
open to further study, this work's main contribution is to highlight such
differences and their potential importance.
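
The perturb, evaluate, and aggregate loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the exact algorithm studied in the paper: Gaussian perturbations, a mean-reward baseline, and the parameters `sigma`, `lr`, and `pop_size` are all choices made for this example, and `es_step` is a hypothetical name.

```python
import random


def es_step(theta, reward_fn, sigma=0.1, lr=0.05, pop_size=50, rng=None):
    """One update of a simplified evolution strategy (illustrative sketch)."""
    rng = rng if rng is not None else random.Random()
    dim = len(theta)
    # Sample one Gaussian perturbation direction per population member.
    noises = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
              for _ in range(pop_size)]
    # Evaluate the reward of each perturbed parameter vector.
    rewards = [reward_fn([t + sigma * e for t, e in zip(theta, eps)])
               for eps in noises]
    # Subtract the mean reward as a baseline (assumption: reduces variance).
    baseline = sum(rewards) / pop_size
    # Aggregate the perturbations, weighted by how well each one performed.
    grad = [sum((r - baseline) * eps[i] for r, eps in zip(rewards, noises))
            / (pop_size * sigma)
            for i in range(dim)]
    # Move the current parameters in the aggregate direction of higher reward.
    return [t + lr * g for t, g in zip(theta, grad)]
```

Because every reward is measured at a perturbed point, the update follows the average reward of the sampled population around `theta` rather than the reward at `theta` itself, which is the robustness-seeking distinction the abstract draws.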
Deep Innovation Protection: Confronting the Credit Assignment Problem in Training Heterogeneous Neural Architectures
Deep reinforcement learning approaches have shown impressive results in a
variety of different domains. However, more complex heterogeneous architectures
such as world models require the different neural components to be trained
separately instead of end-to-end. While a simple genetic algorithm recently
showed end-to-end training is possible, it failed to solve a more complex 3D
task. This paper presents a method called Deep Innovation Protection (DIP) that
addresses the credit assignment problem in training complex heterogeneous neural
network models end-to-end for such environments. The main idea behind the
approach is to employ multiobjective optimization to temporally reduce the
selection pressure on specific components in a multi-component network, allowing
other components to adapt. We investigate the emergent representations of these
evolved networks, which learn to predict properties important for the survival
of the agent, without the need for a specific forward-prediction loss.
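
One way multiobjective selection can relieve pressure on a recently changed component is sketched below. The abstract does not specify the second objective, so this example assumes a hypothetical `age` counter (generations since a protected component was last mutated, lower preferred) alongside task reward, and uses plain Pareto dominance; `Individual`, `dominates`, and `pareto_front` are illustrative names.

```python
from dataclasses import dataclass


@dataclass
class Individual:
    reward: float  # task performance, higher is better
    age: int       # hypothetical: generations since a protected component changed


def dominates(a, b):
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a.reward >= b.reward and a.age <= b.age
    strictly_better = a.reward > b.reward or a.age < b.age
    return no_worse and strictly_better


def pareto_front(population):
    """Return the individuals that no other individual dominates."""
    return [p for p in population
            if not any(dominates(q, p) for q in population)]
```

Under these assumptions, an individual whose component was just mutated (age 0) stays on the front even with a lower reward, which gives the other components time to adapt before it must compete on reward alone.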
Evolving Static Representations for Task Transfer
An important goal for machine learning is to transfer knowledge between tasks. For example, learning to play RoboCup Keepaway should contribute to learning the full game of RoboCup soccer. Previous approaches to transfer in Keepaway have focused on transforming the original representation to fit the new task. In contrast, this paper explores the idea that transfer is most effective if the representation is designed to be the same even across different tasks. To demonstrate this point, a bird's eye view (BEV) representation is introduced that can represent different tasks on the same two-dimensional map. For example, both the 3 vs. 2 and 4 vs. 3 Keepaway tasks can be represented on the same BEV. Yet the problem is that a raw two-dimensional map is high-dimensional and unstructured. This paper shows how this problem is addressed naturally by an idea from evolutionary computation called indirect encoding, which compresses the representation by exploiting its geometry. The result is that the BEV learns a Keepaway policy that transfers without further learning or manipulation. It also facilitates transferring knowledge learned in a different domain, Knight Joust, into Keepaway. Finally, the indirect encoding of the BEV means that its geometry can be changed without altering the solution. Thus static representations facilitate several kinds of transfer.
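
The claim that tasks with different numbers of players fit on the same two-dimensional map can be illustrated with a small sketch. The grid resolution, field size, and the +1/-1 marking of keepers and takers are assumptions made for this example, and `render_bev` is a hypothetical name; the indirect encoding itself is not shown.

```python
def render_bev(keepers, takers, grid_size=20, field_size=20.0):
    """Rasterize (x, y) player positions onto a fixed-size square grid."""
    grid = [[0.0] * grid_size for _ in range(grid_size)]
    cell = field_size / grid_size

    def mark(players, value):
        for x, y in players:
            # Clamp indices so players on the field boundary stay on the grid.
            col = max(0, min(int(x / cell), grid_size - 1))
            row = max(0, min(int(y / cell), grid_size - 1))
            grid[row][col] = value

    mark(keepers, 1.0)   # assumed marking for keepers
    mark(takers, -1.0)   # assumed marking for takers
    return grid
```

Calling `render_bev` with three keepers and two takers, or with four keepers and three takers, returns grids of identical shape, so a policy reading the grid needs no change in its input layout when the task changes.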